22 research outputs found

    Run-Based Semantics for RPQs

    The formalism of RPQs (regular path queries) is an important building block of most query languages for graph databases. RPQs are generally evaluated under homomorphism semantics; in particular, only the endpoints of the matched walks are returned. Practical applications often need the full matched walks to compute aggregate values. In those cases, homomorphism semantics is not suitable, since the number of matched walks can be infinite. Hence, graph-database engines adapt the semantics of RPQs, often neglecting theoretical red flags. For instance, the popular query language Cypher uses trail semantics, which ensures that the result is finite at the cost of making computational problems intractable. We propose a new kind of semantics for RPQs, including in particular simple-run and binding-trail semantics, as a candidate to reconcile theoretical considerations with practical aspirations. Both ensure that the output is finite in a way that is compatible with homomorphism semantics: projection on endpoints coincides with homomorphism semantics. Hence, testing the emptiness of the result is tractable, and known methods readily apply. Moreover, simple-run and binding-trail semantics support bag semantics, and enumeration of the bag of results is tractable. Comment: 35 pages
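To make the finiteness contrast in this abstract concrete, here is a minimal sketch that is not taken from the paper. On a graph with a cycle, the number of walks between two nodes is unbounded, while the trails (walks that repeat no edge) are finitely many and can be listed by a depth-first search. The edge encoding, the function name `trails`, and the example graph are illustrative assumptions, and no regular expression is matched here: every path is accepted.

```python
from typing import Dict, List, Tuple

# Hypothetical edge encoding: (source, label, target).
Edge = Tuple[str, str, str]


def trails(edges: List[Edge], start: str, end: str) -> List[List[Edge]]:
    """Enumerate all trails (walks with no repeated edge) from start to end."""
    # Index outgoing edges by source node.
    out: Dict[str, List[int]] = {}
    for i, (src, _, _) in enumerate(edges):
        out.setdefault(src, []).append(i)

    results: List[List[Edge]] = []
    used = [False] * len(edges)

    def dfs(node: str, path: List[int]) -> None:
        if node == end:
            results.append([edges[i] for i in path])
        for i in out.get(node, []):
            if not used[i]:  # a trail may use each edge at most once
                used[i] = True
                path.append(i)
                dfs(edges[i][2], path)
                path.pop()
                used[i] = False

    dfs(start, [])
    return results


# A two-node cycle a -> b -> a with an exit edge b -> c.
EXAMPLE: List[Edge] = [("a", "x", "b"), ("b", "y", "a"), ("b", "z", "c")]
found = trails(EXAMPLE, "a", "c")
```

In this example there are infinitely many walks from `a` to `c` (the cycle can be taken any number of times), but `found` holds a single trail. The search can take exponential time on larger graphs, which is in line with the intractability the abstract attributes to trail semantics.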

    A Researcher’s Digest of GQL

    GQL (Graph Query Language) is being developed as a new ISO standard for graph query languages, to play the same role for graph databases as SQL plays for relational databases. In parallel, an extension of SQL for querying property graphs, SQL/PGQ, is being added to the SQL standard; it shares its graph pattern matching functionality with GQL. Both standards (not yet published) are hard-to-understand specifications of hundreds of pages. The goal of this paper is to present a digest of the language that is easy for the research community to understand, and thus to initiate research on these future standards for querying graphs. The paper concentrates on the pattern matching features shared by GQL and SQL/PGQ, as well as the querying facilities of GQL.

    Graph Pattern Matching in GQL and SQL/PGQ

    As graph databases become widespread, JTC1 -- the committee in joint charge of information technology standards for the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) -- has approved a project to create GQL, a standard property graph query language. This complements a project to extend SQL with a new part, SQL/PGQ, which specifies how to define graph views over an SQL tabular schema, and to run read-only queries against them. Both projects have been assigned to the ISO/IEC JTC1 SC32 working group for Database Languages, WG3, which continues to maintain and enhance SQL as a whole. This common responsibility helps enforce a policy that the identical core of both PGQ and GQL is a graph pattern matching sub-language, here termed GPML. The WG3 design process is also analyzed by an academic working group, part of the Linked Data Benchmark Council (LDBC), whose task is to produce a formal semantics of these graph data languages, which complements their standard specifications. This paper, written by members of WG3 and LDBC, presents the key elements of the GPML of SQL/PGQ and GQL in advance of the publication of these new standards.

    PG-Schema: Schemas for Property Graphs

    Property graphs have reached a high level of maturity, witnessed by multiple robust graph database systems as well as the ongoing ISO standardization effort aiming at creating a new standard Graph Query Language (GQL). Yet, despite documented demand, schema support is limited both in existing systems and in the first version of the GQL Standard. It is anticipated that the second version of the GQL Standard will include a rich DDL. Aiming to inspire the development of GQL and enhance the capabilities of graph database systems, we propose PG-Schema, a simple yet powerful formalism for specifying property graph schemas. It features PG-Types with flexible type definitions supporting multi-inheritance, as well as expressive constraints based on the recently proposed PG-Keys formalism. We provide the formal syntax and semantics of PG-Schema, which meet principled design requirements grounded in contemporary property graph management scenarios, and offer a detailed comparison of its features with those of existing schema languages and graph database systems. Comment: 25 pages

    Énumération et numération

    This memoir involves several domains of discrete mathematics and theoretical computer science, such as formal languages, numeration, combinatorics on words, algorithmics and complexity. In summary, various problems, all from the general area of numeration, are addressed by means of automata and transducer theory. We first consider integer base numeration systems. Given as a parameter an integer base b, we give a quasi-linear and structural algorithm to decide whether the language accepted by a given automaton is the set of the representations (in base b) of an ultimately periodic set of integers. Second, we consider the rational base p/q and particularly the language L_p/q of the representations of integers in this base. It is a quite complex language according to the usual criteria: in particular, it has a property called FLIP (for Finite Left Iteration Property), which implies that L_p/q does not satisfy any kind of pumping lemma. We prove that, if a monoid M is finitely generated and contains only numbers that are representable in base p/q, then the language of all the representations of the numbers of M has the FLIP. We then study L_p/q from a different perspective: with every integer is associated an infinite word, called minimal, and we consider the function that maps the minimal word associated with n to the minimal word associated with (n+1); we show that this function is realised by an infinite transducer whose structure is virtually the same as that of L_p/q. We finally describe a way to serialise an infinite tree or language into an infinite word, called its signature, by means of a breadth-first traversal. We first note that the signatures of regular languages form a subclass of morphic words, a result linked to the classical transformation automaton/word morphism.
We then treat the case of periodic signatures and show their intrinsic relationship with rational base numeration systems: for every base p/q the language L_p/q has a periodic signature; given a finite sequence r of integers (that we call a rhythm), the signature r^ω generates a language that is a non-canonical way to represent the set of all integers in base p/q, where p/q is the average of the components of r. The notion of signature allows us to define an automaton transformation, called surminimisation, that reduces the number of states of the input automaton, more so than a classical minimisation. However, whereas an automaton and its minimisation accept the same language, this is in general not the case for an automaton and its surminimisation: the surminimisation process indeed preserves only the underlying ordered tree.
This memoir addresses and solves rather different problems, all related to numeration, with a certain conceptual unity in the means used to solve them: automata theory. We first consider integer bases and present a quasi-linear, structural algorithm for deciding whether the language accepted by a given automaton is the representation of an ultimately periodic set of integers. Next, we study the rational base p/q and in particular the language L_p/q of the representations of integers in this base. It is a relatively complex language according to classical formal language theory: it satisfies no form of pumping lemma. We show that every finitely generated monoid is represented by a language as complex as L_p/q. We then take a different perspective to study L_p/q: with each integer is associated an infinite word, called minimal, and we study the function that maps the minimal word of an integer n to that of its successor (n+1); we show in particular that this function is realised by an infinite transducer whose structure is very close to that of the language L_p/q. Finally, we describe a way to serialise infinite trees and languages into words, called signatures, by means of a breadth-first traversal. We first observe that regular languages are associated with morphic words, which echoes the link between regular abstract numeration systems and morphic numeration systems (also known as Dumont-Thomas systems). We then treat the case of periodic signatures and show that they are linked to rational bases; this also gives a very simple procedure for constructing L_p/q. Finally, we define an automaton transformation, surminimisation, which reduces the number of states of an automaton beyond what classical minimisation allows; in return, an automaton and its surminimisation do not accept the same language, only languages with the same underlying ordered tree.
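The abstract above refers to representations of integers in a rational base p/q without spelling out how they are computed. As a minimal sketch, assuming the modified Euclidean division that is commonly used to define this numeration system (write q·n = p·n' + a with 0 ≤ a < p, emit the digit a, continue with n'), the digits can be computed as follows. The function names `to_rational_base` and `evaluate` are hypothetical, and the value of a word a_k … a_0 is assumed to be the sum of (a_i / q)·(p/q)^i.

```python
from fractions import Fraction
from math import gcd
from typing import List


def to_rational_base(n: int, p: int, q: int) -> List[int]:
    """Digits (most significant first) of n >= 0 in base p/q, p > q >= 1 coprime."""
    if n < 0 or not (p > q >= 1) or gcd(p, q) != 1:
        raise ValueError("need n >= 0 and coprime p > q >= 1")
    digits: List[int] = []
    while n > 0:
        # Modified division: q*n = p*n_next + a, with 0 <= a < p.
        n, a = divmod(q * n, p)
        digits.append(a)
    return digits[::-1]


def evaluate(digits: List[int], p: int, q: int) -> Fraction:
    """Value of a digit word: sum of (a_i / q) * (p/q)**i, via Horner's rule."""
    base = Fraction(p, q)
    total = Fraction(0)
    for a in digits:
        total = total * base + Fraction(a, q)
    return total


# Representations of 1..5 in base 3/2 under this assumed definition.
first_five = [to_rational_base(n, 3, 2) for n in range(1, 6)]
```

Under these assumptions, `first_five` is `[[2], [2, 1], [2, 1, 0], [2, 1, 2], [2, 1, 0, 1]]`, and with q = 1 the procedure reduces to the usual representation in integer base p.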